Virtual system and method for securing external network connectivity

ABSTRACT

According to one embodiment, a computing device comprises one or more hardware processors and a memory coupled to the one or more hardware processors. The memory comprises software that supports a virtualization software architecture including a first virtual machine operating under control of a first operating system. Responsive to determining that the first operating system has been compromised, a second operating system, which is stored in the memory in an inactive (dormant) state, becomes active and controls the first virtual machine or a second virtual machine, different from the first virtual machine, which now provides external network connectivity.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is based upon and claims the benefit of priority from U.S. Provisional Patent Application No. 62/187,115 filed Jun. 30, 2015, the entire contents of which are incorporated herein by reference.

FIELD

Embodiments of the disclosure relate to the field of malware detection. More specifically, one embodiment of the disclosure relates to a hypervisor-based, malware detection architecture.

GENERAL BACKGROUND

In general, virtualization is a technique for hosting different guest operating systems concurrently on the same computing platform. With the emergence of hardware support for full virtualization in an increased number of hardware processor architectures, new virtualization software architectures have emerged. One such virtualization architecture involves adding a software abstraction layer, sometimes referred to as a virtualization layer, between the physical hardware and a virtual machine (referred to as “VM”).

A VM is a software abstraction that operates like a physical (real) computing device having a particular operating system. A VM typically features pass-through physical and/or emulated virtual system hardware, and guest system software. The virtual system hardware is implemented by software components in the host (e.g., virtual central processing unit “vCPU” or virtual network interface card “vNIC”) that are configured to operate in a similar manner as corresponding physical components (e.g., physical CPU or NIC). The guest system software comprises a “guest” OS and one or more “guest” applications. Controlling execution and allocation of virtual resources, the guest OS may include an independent instance of an operating system such as WINDOWS® OS, MAC® OS, LINUX® OS or the like. The guest application(s) may include any desired software application type such as a Portable Document Format (PDF) reader (e.g., ACROBAT®), a web browser (e.g., EXPLORER®), a word processing application (e.g., WORD®), or the like.

When the virtualization layer runs on an endpoint device, the guest OS is in control of the pass-through endpoint device hardware, notably the network interface card (NIC). A successful (malicious) attack on the guest OS may allow the attacker to control the guest OS and disable external network connectivity via the guest OS. For instance, if the guest OS crashes as a result of this attack, the virtualization layer of the endpoint device would have neither an ability to communicate with other external devices nor an ability to provide an alert message to advise an administrator of the occurrence of the malicious attack.

A mechanism is needed to ensure external network connectivity and communications even if the guest OS is compromised or no longer functioning correctly.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the disclosure are illustrated by way of example and not by way of limitation in the figures of the accompanying drawings, in which like references indicate similar elements and in which:

FIG. 1A and FIG. 1B are exemplary block diagrams of a system network that may be utilized by a computing device configured to support virtualization with enhanced security.

FIG. 2 is an exemplary block diagram of a logical representation of the endpoint device of FIG. 1.

FIG. 3 is an exemplary embodiment of the virtualization of the endpoint device of FIG. 2 with compromised guest OS detection and OS recovery.

FIG. 4A is an exemplary flowchart of the operations associated with a first technique for guest OS evaluation and OS recovery.

FIG. 4B is an exemplary flowchart of the operations associated with a second technique for guest OS evaluation and OS recovery.

FIG. 5 is an exemplary flowchart of the operations for detecting loss of network connectivity caused by a non-functional guest OS and conducting an OS recovery response to re-establish network connectivity.

DETAILED DESCRIPTION

Various embodiments of the disclosure are directed to added functionality of the virtualization layer to transition from a first virtual machine with a first (guest) operating system to a second virtual machine with a second (recovery) operating system (OS) in response to the virtualization layer determining that the guest OS is “compromised,” namely the guest OS is not functioning properly due to malicious operations conducted by malware. More specifically, the virtualization layer is configured to determine that the guest OS is “compromised” upon detecting (i) an attempt to disable or actual loss of external network connectivity or (ii) the guest OS is no longer working (non-functional). For example, where the guest OS kernel is responsible for the attempt to disable or loss of external network connectivity (and network connectivity cannot be restored after repeated retries), the virtualization layer considers that the guest OS kernel is compromised as being hijacked or infected with malware. As another example, where the guest OS kernel is non-functional, which may be due to a number of factors including malware crashing the kernel, the virtualization layer considers that the guest OS is compromised. As yet another example, where a guest OS application is no longer working (or working properly), especially if that application is crucial to network connectivity (e.g., a network daemon) or issuing alerts (e.g., an agent), the virtualization layer considers that the guest OS is compromised.
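By way of illustration only, and not limitation, the three example determinations above may be sketched in Python as follows. The names `GuestObservation`, `evaluate_guest` and `MAX_RETRIES` are hypothetical and do not appear in the disclosure, and the retry threshold of three is an assumption, since the disclosure refers only to "repeated retries".

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Verdict(Enum):
    HEALTHY = "healthy"
    COMPROMISED = "compromised"


@dataclass
class GuestObservation:
    """Hypothetical summary of what the virtualization layer has observed."""
    kernel_disabled_network: bool = False  # kernel attempted to disable, or lost, connectivity
    failed_restore_attempts: int = 0       # retries that did not restore connectivity
    kernel_functional: bool = True         # False if the guest OS kernel is non-functional
    critical_app_down: bool = False        # e.g., a network daemon or an agent is not working


MAX_RETRIES = 3  # assumed threshold; the disclosure says only "repeated retries"


def evaluate_guest(obs: GuestObservation) -> Tuple[Verdict, Optional[str]]:
    """Return a verdict and, when compromised, the reason for it."""
    # Example (i): connectivity disabled or lost and not restored after retries.
    if obs.kernel_disabled_network and obs.failed_restore_attempts >= MAX_RETRIES:
        return Verdict.COMPROMISED, "network connectivity disabled or lost"
    # Example (ii): the guest OS kernel is non-functional.
    if not obs.kernel_functional:
        return Verdict.COMPROMISED, "guest OS kernel non-functional"
    # Further example: an application crucial to connectivity or alerting is down.
    if obs.critical_app_down:
        return Verdict.COMPROMISED, "critical guest OS application not working"
    return Verdict.HEALTHY, None
```

In this sketch, a single failed attempt to restore connectivity does not yet yield a "compromised" verdict; only repeated failures do, consistent with the first example above.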

Upon determining that the guest OS is compromised, the first virtual machine is stopped by halting operations of one or more virtual processors (vCPUs) within the first virtual machine. As an optional operation, the state of the first virtual machine (e.g., a snapshot of stored content) may be captured by the virtualization layer. Also, normally retained in a dormant state as an OS image within a memory (e.g., a particular non-transitory storage medium such as main memory or on disk), the recovery OS may be accessed once a decision is made to bootstrap the recovery OS.

According to one embodiment of the disclosure, a second virtual machine may be created by bootstrapping a second OS, namely the recovery OS, which includes the recovery OS kernel and one or more guest OS applications. Thereafter, the recovery OS is assigned network device resources that were previously assigned to the guest OS. The recovery OS may be a different type of OS than the guest OS. For instance, the guest OS may be a WINDOWS® OS while the recovery OS may be a LINUX® OS with a minimal memory footprint, stored either on disk or in memory.

According to one embodiment of the disclosure, after the network device resources have been reassigned to the recovery OS, which is responsive to the virtualization layer detecting that the guest OS is compromised, the second virtual machine undergoes a boot process. The purpose of transitioning from the first virtual machine to the second virtual machine is to provide a clean, uninfected and trustworthy platform environment, given that the second virtual machine was dormant (e.g., pre-boot state and not running) when the malicious attack occurred. After completion of the boot process, the second virtual machine is capable of driving a network adapter (e.g., a pass-through physical NIC or a software-emulated NIC) to establish a network connection to another computing device for reporting one or more detected malicious events that occurred while the first virtual machine was executing. This reporting may include the transmission of an alert in a message format (e.g., a Short Message Service “SMS” message, Enhanced Messaging Service “EMS” message, Multimedia Messaging Service “MMS” message, email, etc.) or any other prescribed wired or wireless transmission format. As part of this reporting, a partial state or an entire state of the compromised guest OS (or a portion thereof) may be stored and subsequently provided for offline forensic analysis.

Alternatively, in lieu of deploying another virtual machine, the recovery OS and its corresponding guest OS application(s) may be installed into the first virtual machine along with removal of the guest OS (guest OS kernel and its corresponding guest OS applications). Although the first virtual machine is reused, for discussion herein, the reconfigured first virtual machine is referred to as a “second” virtual machine. However, in accordance with this embodiment, the state of the first virtual machine prior to installation of the recovery OS should be captured as described above. Otherwise, any previous state of the guest OS would be lost upon installation of the recovery OS.
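For purposes of illustration only, the recovery operations described above, under either alternative, may be sketched in Python as follows. The `VirtualMachine` class, the `recover` function and the step labels are hypothetical and are not part of the disclosure. The sketch assumes the ordering in which network device resources are reassigned before the recovery OS is booted, and it captures the state of the first virtual machine whenever that virtual machine is reused.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class VirtualMachine:
    """Hypothetical, simplified model of a virtual machine."""
    name: str
    os_name: str
    vcpus_running: bool = False
    network_devices: List[str] = field(default_factory=list)


def recover(guest_vm: VirtualMachine, recovery_os: str,
            reuse_first_vm: bool = False,
            capture_state: bool = False) -> Tuple[VirtualMachine, List[str]]:
    """Return the second virtual machine and the ordered list of steps taken."""
    steps = []

    # Stop the first virtual machine by halting its vCPUs.
    guest_vm.vcpus_running = False
    steps.append("halt vCPUs")

    # State capture is optional for a new VM, but is performed before the
    # guest OS is replaced so that its state is not lost.
    if capture_state or reuse_first_vm:
        steps.append("capture state")

    # Take the network device resources away from the guest OS.
    devices = guest_vm.network_devices
    guest_vm.network_devices = []

    if reuse_first_vm:
        guest_vm.os_name = recovery_os  # reconfigured first VM acts as the second VM
        second_vm = guest_vm
        steps.append("replace guest OS")
    else:
        second_vm = VirtualMachine(name="VM2", os_name=recovery_os)
        steps.append("create second VM")

    second_vm.network_devices = devices
    steps.append("reassign network devices")

    second_vm.vcpus_running = True
    steps.append("boot recovery OS")
    steps.append("send alert")
    return second_vm, steps
```

Calling `recover(vm1, "recovery OS")` models the two-VM alternative, while passing `reuse_first_vm=True` models the alternative in which the first virtual machine is reconfigured and thereafter referred to as the second virtual machine.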

Herein, the virtualization layer is a logical representation of at least a portion of a host environment of the virtualization for the computing device. The host environment features a light-weight hypervisor (sometimes referred to herein as a “micro-hypervisor”) operating at a high privilege level (e.g., ring “0”). In general, the micro-hypervisor operates similarly to a host kernel, where the micro-hypervisor at least partially controls the behavior of a virtual machine (VM). Examples of different types of VM behaviors may include the allocation of resources for the VM, scheduling for the VM, which events cause VM exits, or the like. The host environment further features a plurality of software components, generally operating as user-level virtual machine monitors (VMMs), which provide host functionality and operate at a lower privilege level (e.g., privilege ring “3”) than the micro-hypervisor.

In summary, a first virtual machine under control of a guest OS is in operation while an OS image of a recovery OS is dormant and resides in a particular location of a non-transitory storage medium. In response to a decision by the virtualization layer to bootstrap the recovery OS, normally upon the occurrence of a prescribed event (e.g., loss of network connectivity caused by a compromised or malfunctioning guest OS kernel) as described, a second virtual machine is created under control of the recovery OS. Alternatively, in response to a decision by the virtualization layer to substitute the recovery OS for the guest OS within the first virtual machine, the reconfigured virtual machine, also referred to as the “second virtual machine,” is created under control of the recovery OS. However, the state of the first virtual machine, notably the guest OS, should be captured prior to substitution of the recovery OS to avoid loss of the state of the guest OS at the time it was compromised.

I. Terminology

In the following description, certain terminology is used to describe features of the invention. For example, in certain situations, the terms “component” and “logic” are representative of hardware, firmware, software or a running process that is configured to perform one or more functions. As hardware, a component (or logic) may include circuitry having data processing or storage functionality. Examples of such circuitry may include, but are not limited or restricted to a hardware processor (e.g., microprocessor with one or more processor cores, a digital signal processor, a programmable gate array, a microcontroller, an application specific integrated circuit “ASIC”, etc.), a semiconductor memory, or combinatorial elements.

A component (or logic) may be software in the form of one or more software modules, such as executable code in the form of an executable application, an API, a subroutine, a function, a procedure, an applet, a servlet, a routine, source code, object code, a shared library/dynamic load library, or one or more instructions. Each or any of these software components may be stored in any type of a suitable non-transitory storage medium, or transitory storage medium (e.g., electrical, optical, acoustical or other form of propagated signals such as carrier waves, infrared signals, or digital signals). Examples of non-transitory storage medium may include, but are not limited or restricted to a programmable circuit; semiconductor memory; non-persistent storage such as volatile memory (e.g., any type of random access memory “RAM”); or persistent storage such as non-volatile memory (e.g., read-only memory “ROM”, power-backed RAM, flash memory, phase-change memory, etc.), a solid-state drive, hard disk drive, an optical disc drive, or a portable memory device. As firmware, the executable code may be stored in persistent storage.

The term “object” generally refers to a collection of data, whether in transit (e.g., over a network) or at rest (e.g., stored), often having a logical structure or organization that enables it to be classified for purposes of analysis for malware. During analysis, for example, the object may exhibit certain expected characteristics (e.g., expected internal content such as bit patterns, data structures, etc.) and, during processing, a set of expected behaviors. The object may also exhibit unexpected characteristics and a set of unexpected behaviors that may offer evidence of the presence of malware and potentially allow the object to be classified as part of a malicious attack.

Examples of objects may include one or more flows or a self-contained element within a flow itself. A “flow” generally refers to related packets that are received, transmitted, or exchanged within a communication session. For convenience, a packet is broadly referred to as a series of bits or bytes having a prescribed format, which may, according to one embodiment, include packets, frames, or cells. Further, an “object” may also refer to individual or a number of packets carrying related payloads, e.g., a single webpage received over a network. Moreover, an object may be a file retrieved from a storage location over an interconnect.

As a self-contained element, the object may be an executable (e.g., an application, program, segment of code, dynamic link library “DLL”, etc.) or a non-executable. Examples of non-executables may include a document (e.g., a Portable Document Format “PDF” document, Microsoft® Office® document, Microsoft® Excel® spreadsheet, etc.), an electronic mail (email), downloaded web page, or the like.

The term “event” should be generally construed as an activity that is conducted by a software component running on the computing device. An event may cause an undesired action to occur, such as overwriting a buffer, disabling a certain protective feature in the guest environment, or a guest OS anomaly such as the guest OS kernel trying to execute from a user page. Generically, an object or event may be referred to as “data under analysis”.

The term “computing device” should be construed as electronics with data processing capability and/or a capability of connecting to any type of network, such as a public network (e.g., Internet), a private network (e.g., a wireless data telecommunication network, a local area network “LAN”, etc.), or a combination of networks. Examples of a computing device may include, but are not limited or restricted to, the following: an endpoint device (e.g., a laptop, a smartphone, a tablet, a desktop computer, a netbook, a medical device, or any general-purpose or special-purpose, user-controlled electronic device configured to support virtualization); a server; a mainframe; a router; or a security appliance that includes any system or subsystem configured to perform functions associated with malware detection and may be communicatively coupled to a network to intercept data routed to or from an endpoint device.

The term “malware” may be broadly construed as information, in the form of software, data, or one or more commands, that is intended to cause an undesired behavior upon execution, where the behavior is deemed to be “undesired” based on customer-specific rules, manufacturer-based rules, and any other type of rules formulated by public opinion or a particular governmental or commercial entity. This undesired behavior may include a communication-based anomaly or an execution-based anomaly that would (1) alter the functionality of an electronic device executing an application software in a malicious manner; (2) alter the functionality of an electronic device executing that application software without any malicious intent; and/or (3) provide an unwanted functionality which is generally acceptable in another context.

The term “interconnect” may be construed as a physical or logical communication path between two or more computing platforms. For instance, the communication path may include wired and/or wireless transmission mediums. Examples of wired and/or wireless transmission mediums may include electrical wiring, optical fiber, cable, bus trace, a radio unit that supports radio frequency (RF) signaling, or any other wired/wireless signal transfer mechanism.

The term “computerized” generally represents that any corresponding operations are conducted by hardware in combination with software and/or firmware. Also, the term “agent” should be interpreted as a software component that instantiates a process running in a virtual machine. The agent may be instrumented into part of an operating system (e.g., guest OS) or part of an application (e.g., guest software application). The agent is configured to provide metadata to a portion of the virtualization layer, namely software that virtualizes certain functionality supported by the computing device.

Lastly, the terms “or” and “and/or” as used herein are to be interpreted as inclusive or meaning any one or any combination. Therefore, “A, B or C” or “A, B and/or C” mean “any of the following: A; B; C; A and B; A and C; B and C; A, B and C.” An exception to this definition will occur only when a combination of elements, functions, steps or acts are in some way inherently mutually exclusive.
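As a purely illustrative aid, the inclusive reading of “A, B or C” set forth above corresponds to every non-empty combination of the listed elements, which may be enumerated with the following short Python sketch. The function name `inclusive_or` is hypothetical.

```python
from itertools import combinations


def inclusive_or(elements):
    """List every non-empty combination of the elements, smallest first."""
    return [
        combo
        for size in range(1, len(elements) + 1)
        for combo in combinations(elements, size)
    ]


options = inclusive_or(["A", "B", "C"])
```

For the three elements A, B and C, `options` holds seven combinations, matching the seven alternatives enumerated in the definition above.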

II. General Architecture

Referring to FIG. 1A, an exemplary block diagram of a system network 100 that may be utilized by a computing device configured to support virtualization with enhanced security is described herein. The system network 100 may be organized as a plurality of networks, such as a public network 110 and/or a private network 120 (e.g., an organization or enterprise network). According to this embodiment of system network 100, the public network 110 and the private network 120 are communicatively coupled via network interconnects 130 and intermediary computing devices 140₁, such as network switches, routers and/or one or more malware detection system (MDS) appliances (e.g., intermediary computing device 140₂) as described in co-pending U.S. patent application entitled “Microvisor-Based Malware Detection Appliance Architecture” (U.S. patent application Ser. No. 14/962,497), the entire contents of which are incorporated herein by reference. The network interconnects 130 and intermediary computing devices 140₁, inter alia, provide connectivity between the private network 120 and a computing device 140₃, which may be operating as an endpoint device for example.

The computing devices 140ᵢ (i=1, 2, 3) illustratively communicate by exchanging messages (e.g., packets or other data in a prescribed format) according to a predefined set of protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP). However, it should be noted that other protocols, such as the HyperText Transfer Protocol Secure (HTTPS) for example, may be advantageously used with the inventive aspects described herein. In the case of private network 120, the intermediary computing device 140₁ may include a firewall or other computing device configured to limit or block certain network traffic in an attempt to protect the endpoint devices 140₃ from unauthorized users.

As illustrated in FIG. 1B in greater detail, the endpoint device 140₃ supports a virtualization software architecture 150 that comprises a guest environment 160 and a host environment 180. As shown, the guest environment 160 comprises one or more virtual machines. A first virtual machine (VM1) 170 comprises a guest operating system (OS) 300, which includes a guest OS kernel. Residing within the first virtual machine 170 (e.g., within the guest OS 300 or another software component within the first virtual machine 170 or the guest environment 160), a certain component, which is sometimes referred to as a “guest agent” 172, may be configured to monitor and store metadata (e.g., state information, memory accesses, process names, etc.) and subsequently provide the metadata to a virtualization layer 185 within the host environment 180.

A second virtual machine (VM2) 175 comprises a recovery OS 310, which includes a recovery OS kernel 311 and one or more guest OS applications 312 (e.g., applications to configure or bring up the network interface such as a DHCP client, applications for copying files from one machine to another, etc.). The second virtual machine 175 resides in a dormant (non-boot) state until its boot process is initiated by the virtualization layer 185. Although the virtualization layer provides memory isolation between software components, in order to further mitigate a spread of infection of malware already infecting a network device, it is contemplated that the recovery OS 310 may be deployed with an OS type different than the OS type of the guest OS 300. In particular, if the guest OS 300 has been compromised due to an exploitable software vulnerability, it is not desired to provide the recovery OS 310 with the same vulnerability. For example, where the guest OS 300 within the first virtual machine 170 is a WINDOWS® operating system or an IOS® operating system, the recovery OS 310 within the second virtual machine 175 (as shown) may be configured as a version of the LINUX® operating system or the ANDROID® operating system for example. It is contemplated that the second virtual machine 175 may be shown as a logical representation, where the second virtual machine 175 is in fact the first virtual machine 170, where the guest OS 300 (guest OS kernel and corresponding guest applications—not shown) are replaced by the recovery OS 310.

The virtualization layer 185 features (i) a micro-hypervisor 360 (shown in FIG. 3) with access to physical hardware 190 and (ii) one or more host applications running in the user space (not shown). Both the micro-hypervisor and the host applications operate in concert to provide additional functionality by controlling configuration of the second virtual machine 175, including activation or deactivation of the first virtual machine 170 in response to detection of events associated with anomalous behaviors indicating the guest OS 300 has been compromised. This additional functionality ensures external network connectivity is available to the endpoint device 140₃, even when the guest OS 300 is non-functional, potentially hijacked by malware.

Referring now to FIG. 2, an exemplary block diagram of a representation of the endpoint device 140₃ is shown. Herein, the endpoint device 140₃ illustratively includes at least one hardware processor 210, a memory 220, one or more network interfaces (referred to as “network interface(s)”) 230, and one or more network devices (referred to as “network device(s)”) 240, communicatively coupled by a system interconnect 260, such as a bus. These components are at least partially encased in a housing 200, which is made entirely or partially of a rigid material (e.g., hardened plastic, metal, glass, composite, or any combination thereof) that protects these components from atmospheric conditions.

It is contemplated that some or all of the network device(s) 240 may be coupled to the system interconnect 260 via an Input/Output Memory Management Unit (IOMMU) 250. The IOMMU 250 provides direct memory access (DMA) management capabilities for direct access of data within the memory 220. According to one embodiment of the disclosure, in response to signaling from a component within the virtualization layer 185 (e.g., micro-hypervisor 360 or one of the hyper-processes 370), the IOMMU 250 may be configured to assign (or re-assign) network devices to a particular OS. For instance, when re-configured by the virtualization layer 185, the IOMMU 250 may re-assign some or all of the network device(s) 240 from the guest OS 300 to the recovery OS 310 and one of the hyper-processes 370, namely the guest monitor component 376, may reassign all PCI device (e.g., memory-mapped input/output “MMIO” or I/O) resources of a network adapter from one virtual machine to another, as described below.
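Solely as an illustrative sketch, the re-assignment of network devices described above may be modeled in Python as a table that records which OS currently owns each device and permits DMA only for that owner. The class `DeviceAssignmentTable`, its methods and the device names are hypothetical; an actual IOMMU is programmed through hardware-specific translation structures that this sketch does not attempt to represent.

```python
class DeviceAssignmentTable:
    """Hypothetical model of per-device ownership enforced by an IOMMU."""

    def __init__(self):
        self._owner = {}  # device name -> owning OS (domain)

    def assign(self, device, domain):
        """Assign (or re-assign) a single device to a domain."""
        self._owner[device] = domain

    def reassign_all(self, from_domain, to_domain):
        """Move every device owned by from_domain to to_domain."""
        moved = [dev for dev, owner in self._owner.items() if owner == from_domain]
        for dev in moved:
            self._owner[dev] = to_domain
        return sorted(moved)

    def dma_allowed(self, device, domain):
        """DMA is permitted only for the domain that currently owns the device."""
        return self._owner.get(device) == domain


table = DeviceAssignmentTable()
table.assign("nic0", "guest OS 300")
table.assign("wifi0", "guest OS 300")
moved = table.reassign_all("guest OS 300", "recovery OS 310")
```

After the call to `reassign_all`, the guest OS 300 no longer owns either device in this model, so a DMA request attributed to it would be refused, while the recovery OS 310 may drive both devices.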

The hardware processor 210 is a multipurpose, programmable device that accepts digital data as input, processes the input data according to instructions stored in its memory, and provides results as output. One example of the hardware processor 210 may include an Intel® x86 central processing unit (CPU) with an instruction set architecture. Alternatively, the hardware processor 210 may include another type of CPU, a digital signal processor (DSP), an ASIC, or the like.

The network device(s) 240 may include various input/output (I/O) or peripheral devices, such as a storage device for example. One type of storage device may include a solid state drive (SSD) embodied as a flash storage device or other non-volatile, solid-state electronic device (e.g., drives based on storage class memory components). Another type of storage device may include a hard disk drive (HDD).

Each network interface 230 may include one or more network ports containing the mechanical, electrical and/or signaling circuitry needed to connect the endpoint device 140₃ to the private network 120 of FIG. 1A to thereby facilitate communications over the system network 100. To that end, the network interface(s) 230 may be configured to transmit and/or receive messages using a variety of communication protocols including, inter alia, TCP/IP and HTTPS.

The memory 220 may include a plurality of locations that are addressable by the hardware processor 210 and the network interface(s) 230 for storing software (including software applications) and data structures associated with such software. The hardware processor 210 is adapted to manipulate the stored data structures as well as execute the stored software, which includes the guest OS 300, the recovery OS 310, user mode processes 320, a micro-hypervisor 360 and hyper-processes 370.

Herein, the hyper-processes 370 may include instances of software program code (e.g., user-space applications operating as user-level VMMs) that are isolated from each other and run in separate address spaces. In communication with the micro-hypervisor 360, the hyper-processes 370 are responsible for controlling operability of the endpoint device 140₃, including policy and resource allocation decisions, maintaining logs of monitored events for subsequent analysis, managing virtual machine (VM) execution, and managing malware detection and classification.

The micro-hypervisor 360 is disposed or layered beneath the guest OS kernel 301 and/or the recovery OS kernel 311 of the endpoint device 140₃ and is the only component that runs in the most privileged processor mode (host mode, ring-0). As part of a trusted computing base of most components in the computing platform, the micro-hypervisor 360 is configured as a light-weight hypervisor (e.g., less than 10K lines of code), thereby avoiding inclusion of potentially exploitable virtualization code in an operating system (e.g., x86 virtualization code).

The micro-hypervisor 360 generally operates as the host kernel that is devoid of policy enforcement; rather, the micro-hypervisor 360 provides a plurality of mechanisms that may be used by the hyper-processes 370 for controlling operability of the virtualization. These mechanisms may be configured to control communications between separate protection domains (e.g., between two different hyper-processes 370), coordinate thread processing within the hyper-processes 370 and virtual CPU (vCPU) processing within the VM1 170 or VM2 175, delegate and/or revoke hardware resources, and control interrupt delivery and DMA, as described below.
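The separation between mechanism and policy described above may be illustrated, without limitation, by the following Python sketch. The classes `MicroHypervisor` and `GuestMonitor`, their methods and the domain labels "VM1" and "VM2" are hypothetical stand-ins: the first only records delegations and revocations, while the second, modeling a hyper-process, decides when to invoke them.

```python
class MicroHypervisor:
    """Mechanisms only: records delegations and makes no decisions of its own."""

    def __init__(self):
        self._delegations = set()  # pairs of (protection domain, resource)

    def delegate(self, domain, resource):
        self._delegations.add((domain, resource))

    def revoke(self, domain, resource):
        self._delegations.discard((domain, resource))

    def holds(self, domain, resource):
        return (domain, resource) in self._delegations


class GuestMonitor:
    """Hypothetical hyper-process: the policy decision is made here, in user space."""

    def __init__(self, hypervisor):
        self._hv = hypervisor

    def on_guest_compromised(self, resource):
        # Policy: take the resource from the first VM and hand it to the second.
        self._hv.revoke("VM1", resource)
        self._hv.delegate("VM2", resource)


hv = MicroHypervisor()
hv.delegate("VM1", "nic0")
GuestMonitor(hv).on_guest_compromised("nic0")
```

Keeping the decision logic outside the `MicroHypervisor` class mirrors the stated design goal of a small, policy-free component running at the most privileged level.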

The guest OS 300, portions of which are resident in memory 220 and executed by the hardware processor 210, functionally organizes the endpoint device 140₃ by, inter alia, invoking operations that support guest applications executing on the endpoint device 140₃. An exemplary guest OS 300 may include a version of the WINDOWS® operating systems, a version of a MAC OS® and IOS® series of operating systems, a version of the LINUX® operating system or a version of the ANDROID® operating system, among others.

The recovery OS 310, portions of which are resident in memory 220 and executed by the hardware processor 210, functionally organizes the endpoint device 140₃ by, inter alia, invoking operations to at least drive one or more network adapters that provide external network connectivity by establishing one or more external communication channels with one or more other computing devices. Examples of a network adapter may include, but are not limited or restricted to, physical or software-emulated data transfer devices such as a network interface card (NIC), a modem, a wireless chipset that supports radio (e.g., radio frequency “RF” signals such as IEEE 802.11 based communications) or supports cellular transmissions, or a light emitting device that produces light pulses for communications. It is contemplated that credentials for wireless network access may either be pre-configured in the recovery OS 310, or the guest agent 172 in the guest OS 300 (which is virtualization aware) can send those credentials via the virtualization layer to the recovery OS 310. For wireless network connectivity, the credentials may include a Service Set Identifier (SSID), one or more pre-shared keys, or the like.

Herein, in order to prevent malware from a compromised guest OS 300 of the first virtual machine 170 from potentially infecting the recovery OS 310 within the second virtual machine 175, the recovery OS 310 may feature an operating system type different from the guest OS 300 so as not to suffer the same vulnerabilities that could be exploited. Also, the recovery OS 310 of the second virtual machine 175 may be configured to support lesser functionality than the guest OS 300; for example, the recovery OS 310 may be configured to drive only a small subset (e.g., less than ten) of the network devices supported by the guest OS 300. Although the second virtual machine 175 is shown, for illustrative purposes, as being separate from the first virtual machine 170, it is contemplated that the second virtual machine 175 may be a different virtual machine or a reconfiguration of the first virtual machine 170 with the recovery OS 310 in lieu of the guest OS 300. For the latter, the first virtual machine 170 is reused by deleting the compromised guest OS 300 and substituting the recovery OS 310.

Running on top of the guest OS kernel 301, some of the user mode processes 320 constitute instances of guest OS applications 302 and/or guest applications 322 running in their separate address space. As an example, one of the guest application processes 322 running on top of the guest OS kernel 301 may include ADOBE® READER® from Adobe Systems Inc. of San Jose, Calif. or MICROSOFT® WORD® from Microsoft Corporation of Redmond, Wash. Events (monitored behaviors) of an object that is processed by one of the user mode processes 320 are monitored by a guest agent process 172, which provides metadata to at least one of the hyper-processes 370 and the micro-hypervisor 360 for use in malware detection. Hence, as shown, the object and associated events may be analyzed for the presence of malware; however, it is contemplated that the analytical functionality provided by the different malware detection processes could be provided by different malware detection modules/drivers (not shown) in the guest OS kernel 301. For such deployment, a guest OS anomaly may be detected.

III. Virtualization Software Architecture

Referring now to FIG. 3, an exemplary embodiment of the virtualization software architecture 150 of the endpoint device 140 ₃ with compromised guest OS detection and OS recovery is shown. The virtualization software architecture 150 comprises guest environment 160 and host environment 180, both of which may be configured in accordance with a protection ring architecture as shown. While the protection ring architecture is shown for illustrative purposes, it is contemplated that other architectures that establish hierarchical privilege levels for virtualization software components may be utilized.

A. Guest Environment

As shown, the guest environment 160 comprises the first virtual machine 170, which is adapted to analyze an object 335 and/or events produced during execution of the first virtual machine 170 (hereinafter generally referred to as “data for analysis”) for the presence of malware. As shown, the first virtual machine 170 features a guest OS 300 having a guest OS kernel 301 that runs at the most privileged level (Ring-0 305), along with one or more processes 320, which may include one or more instances of guest OS applications 302 and/or one or more instances of software applications 322 (hereinafter “guest application process(es)”). Running at a lesser privileged level (Ring-3 325), the guest application process(es) 322 may be based on the same software application, different versions of the same software application, or even different software applications, provided the guest application process(es) 322 may be controlled by the same guest OS kernel 301 (e.g., WINDOWS® OS kernel).

It is contemplated that malware detection on the endpoint device 140 ₃ may be conducted by one or more processes embodied as software components (e.g., guest OS application(s)) running within the first virtual machine 170. These processes include a static analysis process 330, a heuristics process 332 and a dynamic analysis process 334, which collectively operate to detect suspicious and/or malicious behaviors by the object 335 that occur during execution within the first virtual machine 170. Notably, the endpoint device 140 ₃ may perform (implement) malware detection as background processing (i.e., minor use of endpoint resources) with data processing being implemented as its primary processing (e.g., in the foreground having majority use of endpoint resources).

As used herein, the object 335 may include, for example, a web page, email, email attachment, file or universal resource locator. Static analysis may conduct a brief examination of characteristics (internal content) of the object 335 to determine whether it is suspicious, while dynamic analysis may analyze behaviors associated with events that occur during virtual execution of the object 335, especially characteristics involving a network adapter such as a physical pass-through network interface card (NIC) (hereinafter “network adapter”) 304. A loss of network connectivity, for instance, can be determined in a number of ways. As one example, the guest agent 172 may initiate keepalive network packets, and the failure to receive responses to these packets may denote loss of network connectivity. Additionally or in the alternative, the virtualization layer may detect that the network adapter is not working based on a lack of network interrupts, or based on statistical registers in the network adapter indicating that the number of bytes sent/received is below a prescribed threshold.
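By way of illustration only, the three connectivity indicators described above (keepalive responses, network interrupts, and byte counters) may be combined as in the following minimal sketch. The `AdapterSample` structure, the `connectivity_loss_reasons` function, and the default threshold of 1024 bytes are hypothetical names and values chosen for this example; they do not appear in the disclosure.

```python
from dataclasses import dataclass


@dataclass
class AdapterSample:
    """Hypothetical snapshot of adapter activity over one observation window."""
    keepalives_sent: int
    keepalive_replies: int
    interrupts: int
    bytes_transferred: int


def connectivity_loss_reasons(sample, min_bytes=1024):
    """Return the indicators suggesting lost connectivity (empty list if none).

    min_bytes stands in for the "prescribed threshold"; its value is assumed.
    """
    reasons = []
    # Keepalive packets were sent but none were answered.
    if sample.keepalives_sent > 0 and sample.keepalive_replies == 0:
        reasons.append("no keepalive responses")
    # The adapter raised no interrupts during the window.
    if sample.interrupts == 0:
        reasons.append("no network interrupts")
    # The statistical registers report too little traffic.
    if sample.bytes_transferred < min_bytes:
        reasons.append("byte count below threshold")
    return reasons
```

Whether a single indicator or a combination of indicators is sufficient to declare a loss of connectivity would be a policy choice; the sketch merely reports which indicators are present.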

According to one embodiment of the disclosure, the static analysis process 330 and the heuristics process 332 may conduct a first examination of the object 335 to determine whether any characteristics of the object are suspicious and/or malicious. A finding of “suspicious” denotes that the characteristics signify a first probability range of the analyzed object 335 being malicious while a finding of “malicious” denotes that the characteristics signify a higher, second probability of the analyzed object 335 being malicious.

The static analysis process 330 and the heuristics process 332 may employ statistical analysis techniques, including the use of vulnerability/exploit signatures and heuristics, to perform non-behavioral analysis in order to detect anomalous characteristics (i.e., suspiciousness and/or maliciousness) without execution (i.e., monitoring run-time behavior) of the object 335. For example, the static analysis process 330 may employ signatures (referred to as vulnerability or exploit “indicators”) to match content (e.g., bit patterns) of the object 335 with patterns of the indicators in order to gather information that may be indicative of suspiciousness and/or malware. The heuristics module 332 may apply rules and/or policies to detect anomalous characteristics of the object 335 in order to identify whether the object 335 is suspect and deserving of further analysis or whether it is non-suspect (i.e., benign) and not in need of further analysis. These statistical analysis techniques may produce static analysis results (e.g., identification of communication protocol anomalies and/or suspect source addresses of known malicious servers) that may be provided to a reporting module 336.

More specifically, the static analysis process 330 may be configured to compare a bit pattern of the object 335 content with a “blacklist” of suspicious exploit indicator patterns. For example, a simple indicator check (e.g., hash) against the hashes of the blacklist (i.e., exploit indicators of objects deemed suspicious) may reveal a match, where a score may be subsequently generated (based on the content) by the threat protection component 376 to identify that the object may include malware. In addition to, or as an alternative to, a blacklist of suspicious objects, bit patterns of the object 335 may be compared with a “whitelist” of permitted bit patterns.
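As a non-limiting sketch of the indicator check described above, the content of an object may be hashed and the digest looked up in a blacklist and a whitelist. The use of SHA-256, the function name `check_indicator`, the returned labels, and the choice to consult the whitelist first are all assumptions made for this example.

```python
import hashlib


def check_indicator(content, blacklist, whitelist):
    """Classify object content by comparing its hash against two digest sets.

    blacklist and whitelist are sets of hexadecimal SHA-256 digests
    (the hash algorithm is an assumption of this sketch).
    """
    digest = hashlib.sha256(content).hexdigest()
    # Assumed precedence: a permitted pattern is accepted before the
    # blacklist is consulted.
    if digest in whitelist:
        return "permitted"
    if digest in blacklist:
        return "suspicious"
    return "unknown"
```

A result of "suspicious" would correspond to the match that prompts the subsequent scoring described above; a result of "unknown" leaves the decision to the other analysis processes.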

The dynamic analysis process 334 may conduct an analysis of the object 335 during its processing, where the guest agent process 172 monitors the run-time behaviors of the object 335 and captures certain types of events that occur during run time. The events are stored within a ring buffer 340 of the guest agent 172 for possible subsequent analysis by the threat protection component 376, as described below. In an embodiment, the dynamic analysis process 334 normally operates concurrently (e.g., at least partially at the same time) with the static analysis process 330 and/or the heuristics process 332. During processing of the object 335, particular events may be hooked to trigger signaling (and the transfer of data) to the host environment 180 for further analysis by the threat protection component 376 and/or master controller component 372.

For instance, the dynamic analysis process 334 may examine whether any behaviors associated with a detected event that occur during processing of the analyzed object 335 are suspicious and/or malicious. One of these detected events may pertain to activities with the network adapter 304 or any activities that are directed to altering a current operating state of the network adapter 304. A finding of “suspicious” denotes that the behaviors signify a first probability range of the analyzed object 335 being associated with malware while a finding of “malicious” denotes that the behaviors signify a higher second probability of the analyzed object 335 being associated with malware. The dynamic analysis results (and/or events caused by the processing of the object 335 and/or object itself) may also be provided to reporting module 336.

Based on the static analysis results and/or the dynamic analysis results, the reporting module 336 may be configured to generate a report (result data in a particular format) or an alert (message advising of the detection of suspicious or malicious events) for transmission via network adapter 314 to a remotely located computing device, such as MDS 140 ₂ or another type of computing device.

In addition to or in lieu of analysis of the object 335, it is contemplated that a guest OS anomaly may be detected by malware detection processes 302 or malware detection modules/drivers 345 in the guest OS kernel 301 and reported to the host environment 180 (e.g., guest monitor component 374 and/or threat protection component 376) and/or the reporting module 336.

1. Guest OS

In general, the guest OS 300 manages certain operability of the first virtual machine 170, where some of these operations are directed to the execution and allocation of virtual resources involving network connectivity, memory translation, or driving of one or more network devices including a network adapter. More specifically, the guest OS 300 may receive an input/output (I/O) request from the object 335 being processed by one or more guest application process(es) 322 and, in some cases, translate the I/O request into instructions. These instructions may be used, at least in part, by virtual system hardware (e.g., vCPU 303), to drive the network adapter 304 for establishing network communications with other network devices. Upon establishing connectivity with the private network 120 and/or the public network 110 of FIG. 1 and in response to detection that the object 335 is malicious, the endpoint device 140 ₃ may initiate an alert message via reporting module 336 and the network adapter 304. Alternatively, with network connectivity, the guest OS 300 may receive software updates from administrators via the private network 120 of FIG. 1 or from a third party provider via the public network 110 of FIG. 1.

2. Guest Agent

According to one embodiment of the disclosure, the guest agent 172 is a software component configured to provide the virtualization layer 185 with metadata that may assist in the handling of malware detection. Instrumented into either a guest software application 320 (as shown), a portion of the guest OS 300 or operating as a separate module, the guest agent 172 is configured to provide metadata to the virtualization layer 185 in response to at least one selected event.

Herein, the guest agent 172 comprises one or more ring buffers 340 (e.g., queue, FIFO, shared memory, and/or registers), which record certain events that may be considered of interest for malware detection. Examples of these events may include information associated with a newly created process (e.g., process identifier, time of creation, originating source for creation of the new process, etc.), information associated with an access to a certain restricted port or memory address, or the like. The retrieval of the information associated with the stored events may occur through a “pull” or “push” scheme, where the guest agent 172 may be configured to download the metadata periodically or aperiodically (e.g., when the ring buffer 340 exceeds a certain storage level or in response to a request). The request may originate from the threat protection component 376 and is generated by the guest monitor component 374.
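The “push” and “pull” schemes described above may be sketched, purely for illustration, as a bounded event buffer that forwards its contents when a storage level is exceeded (push) or when explicitly flushed on request (pull). The class name `EventRingBuffer`, its parameters, and the `sink` callable are hypothetical and are not taken from the disclosure.

```python
from collections import deque


class EventRingBuffer:
    """Bounded event store with threshold-triggered push and on-request pull."""

    def __init__(self, capacity, push_level, sink):
        # A deque with maxlen silently drops the oldest event when full,
        # which gives ring-buffer behavior.
        self._events = deque(maxlen=capacity)
        self._push_level = push_level   # assumed "certain storage level"
        self._sink = sink               # callable that receives a list of events

    def record(self, event):
        """Store an event; push the batch once the storage level is exceeded."""
        self._events.append(event)
        if len(self._events) > self._push_level:
            self.flush()

    def flush(self):
        """Deliver and clear all buffered events (the "pull" path)."""
        batch = list(self._events)
        self._events.clear()
        if batch:
            self._sink(batch)
        return batch
```

In this sketch, `push_level` should be smaller than `capacity`; otherwise the storage level is never exceeded and the oldest events are overwritten before they can be delivered.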

3. Recovery OS

When dormant, the recovery OS 310 is stored as an OS image. When active, the recovery OS 310 manages operability of the second virtual machine 175, most notably its network connectivity. More specifically, the second virtual machine 175 with the recovery OS 310 transitions from its normal dormant (inactive) state into an active state in response to the virtualization layer 185 determining that the guest OS 300 of the first virtual machine 170 has been compromised. Prior to, contemporaneously with, or after activation of the second virtual machine 175, the first virtual machine 170 transitions from an active state to an inactive state when the second virtual machine 175 is deployed as a separate virtual machine. When the recovery OS 310 is substituted for the guest OS 300 within the first virtual machine 170, the second virtual machine 175 is effectively the reconfigured first virtual machine 170.

The above-described transitions are conducted to provide the endpoint device 140 ₃ with external network connectivity that includes one or more external communication channels (via the network adapter 314) to a remotely located (external) computing device. The external communication channel allows for transmission of an alert message from the reporting module 336 to an external computing device to denote the detection of a malicious attack. Alternatively, with network connectivity, the recovery OS 310 may receive software updates (e.g., patches, an updated version, etc.) from an administrator via the private network 120 of FIG. 1 or from a third party provider via the public network 110 of FIG. 1. The recovery OS 310 may be used for all forms of investigative analysis of the guest OS 300 as well as for remediation of any issues related to the (compromised) guest OS 300. As an example, the network connectivity of the recovery OS 310 may be used by a remote party to perform over-the-network forensic analysis of a host that has been reported as compromised or malfunctioning or to send off the state/image of the compromised VM 170 across the network for remote analysis.

B. Host Environment

As further shown in FIG. 3, the host environment 180 features a protection ring architecture that is arranged with a privilege hierarchy from the most privileged level 350 (Ring-0) to a lesser privilege level 352 (Ring-3). Positioned at the most privileged level 350 (Ring-0), the micro-hypervisor 360 is configured to directly interact with the physical hardware, such as hardware processor 210 or memory 220 of FIG. 2.

Running on top of the micro-hypervisor 360 in Ring-3 352, a plurality of processes being instances of host applications (referred to as “hyper-processes” 370) communicate with the micro-hypervisor 360. Some of these hyper-processes 370 may include master controller component 372, guest monitor component 374 and threat protection component 376. Each of these hyper-processes 372, 374 and 376 represents a separate software instance with different functionality and is running in a separate address space. As these hyper-processes 370 are isolated from each other (i.e. not in the same binary), inter-process communications between the hyper-processes 370 are handled by the micro-hypervisor 360, but regulated through policy protection by the master controller component 372.

1. Micro-Hypervisor

The micro-hypervisor 360 may be configured as a light-weight hypervisor (e.g., less than 10K lines of code) that operates as a “host” OS kernel. The micro-hypervisor 360 features logic (mechanisms) for controlling operability of the computing device, such as endpoint device 140 ₃ as shown. The mechanisms include inter-process communication (IPC) logic 362, resource allocation logic 364 and scheduling logic 366, where all of these mechanisms are based, at least in part, on a plurality of kernel features, namely protection domains, execution contexts, scheduling contexts, portals, and semaphores (hereinafter collectively referred to as “kernel features 368”), as partially described in a co-pending U.S. patent application entitled “Microvisor-Based Malware Detection Endpoint Architecture” (U.S. patent application Ser. No. 14/929,821), the entire contents of which are incorporated herein by reference.

More specifically, a first kernel feature is referred to as “protection domains,” which correspond to containers where certain resources for the hyper-processes 370 can be assigned, such as various data structures (e.g., execution contexts, scheduling contexts, etc.). Given that each hyper-process 370 corresponds to a different protection domain, a first hyper-process (e.g., master controller component 372) is spatially isolated from a second (different) hyper-process (e.g., guest monitor component 374). Furthermore, the first hyper-process would be spatially isolated (within the address space) from the first and second virtual machines 170 and 175 as well.

A second kernel feature is referred to as an “execution context,” which features thread level activities within one of the hyper-processes (e.g., master controller component 372). These activities may include, inter alia, (i) contents of hardware registers, (ii) pointers/values on a stack, (iii) a program counter, and/or (iv) allocation of memory via, e.g., memory pages. The execution context is thus a static view of the state of a thread of execution.

Accordingly, the thread executes within a protection domain associated with that hyper-process of which the thread is a part. For the thread to execute on a hardware processor 210, its execution context may be tightly linked to a scheduling context (third kernel feature), which may be configured to provide information for scheduling the execution context for execution on the hardware processor 210. Illustratively, the scheduling context may include a priority and a quantum time for execution of its linked execution context on the hardware processor 210.

Hence, besides the spatial isolation provided by protection domains, the micro-hypervisor 360 enforces temporal separation through the scheduling context, which is used for scheduling the processing of the execution context as described above. Such scheduling by the micro-hypervisor 360 may involve defining which hardware processor may process the execution context (in a multi-processor environment), what priority is assigned to the execution context, and the duration of such execution.

Communications between protection domains are governed by portals, which represent a fourth kernel feature that is relied upon for generation of the IPC logic 362. Each portal represents a dedicated entry point into a corresponding protection domain. As a result, if one protection domain creates the portal, another protection domain may be configured to call the portal and establish a cross-domain communication channel.

Lastly, of the kernel features, semaphores facilitate synchronization between execution contexts on the same or on different hardware processors. The micro-hypervisor 360 uses the semaphores to signal the occurrence of hardware interrupts to the user applications.

The micro-hypervisor 360 utilizes one or more of these kernel features to formulate mechanisms for controlling operability of the endpoint device 200. One of these mechanisms is the IPC logic 362, which supports communications between separate protection domains (e.g., between two different hyper-processes 370). Thus, under the control of the IPC logic 362, in order for a first software component to communicate with another software component, the first software component needs to route a message to the micro-hypervisor 360. In response, the micro-hypervisor 360 switches from a first protection domain (e.g., first hyper-process 372) to a second protection domain (e.g., second hyper-process 374) and copies the message from an address space associated with the first hyper-process 372 to a different address space associated with the second hyper-process 374.
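The portal-based message passing described above may be modeled, as a simplified and non-authoritative sketch, by a broker that looks up the protection domain owning a portal, switches to that domain, and copies the message into it. The classes `ProtectionDomain` and `MicroHypervisorIPC`, the `inbox` list standing in for a private address space, and the portal identifier are hypothetical.

```python
import copy


class ProtectionDomain:
    """Stand-in for a hyper-process with its own address space."""

    def __init__(self, name):
        self.name = name
        self.inbox = []   # models memory private to this domain


class MicroHypervisorIPC:
    """Routes messages between domains through registered portals."""

    def __init__(self):
        self._portals = {}    # portal id -> owning protection domain
        self.current = None   # domain most recently switched to

    def create_portal(self, portal_id, owner):
        """Register a dedicated entry point into the owner's domain."""
        self._portals[portal_id] = owner

    def call(self, sender, portal_id, message):
        """Switch to the portal's domain and copy the message into it."""
        target = self._portals[portal_id]   # raises KeyError for unknown portals
        self.current = target
        # A deep copy models copying between separate address spaces: the
        # receiver never shares memory with the sender.
        target.inbox.append((sender.name, copy.deepcopy(message)))
```

Because the message is copied rather than shared, a later modification by the sender does not affect what the receiving domain observes.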

Another mechanism provided by the micro-hypervisor 360 is resource allocation logic 364. The resource allocation logic 364 enables a first software component to share one or more memory pages with a second software component under the control of the micro-hypervisor 360. Being aware of the location of one or more memory pages, the micro-hypervisor 360 provides the protection domain associated with the second software component access to the memory location(s) associated with the one or more memory pages.

Also, the micro-hypervisor 360 contains scheduling logic 366 that, when invoked, selects the highest-priority scheduling context and dispatches the execution context associated with the scheduling context. As a result, the scheduling logic 366 ensures that, at some point in time, all of the software components can run on the hardware processor 210 as defined by the scheduling context. Also, the scheduling logic 366 enforces that no component can monopolize the hardware processor 210 for longer than defined by the scheduling context.
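One possible reading of the scheduling logic described above is sketched below: the highest-priority scheduling context is selected, its linked execution context runs for at most one quantum, and contexts of equal priority take turns. The `SchedulingContext` fields, the convention that a larger number means higher priority, and the round-robin tie-breaking are assumptions of this example rather than details of the disclosure.

```python
from dataclasses import dataclass


@dataclass
class SchedulingContext:
    name: str        # label of the linked execution context
    priority: int    # larger value = higher priority (assumed convention)
    quantum: int     # maximum time units per dispatch
    remaining: int   # work left, in time units

    def __post_init__(self):
        if self.quantum <= 0:
            raise ValueError("quantum must be positive")


def dispatch_all(contexts):
    """Run every context to completion; return (name, time_units) per dispatch."""
    ready = list(contexts)
    trace = []
    while ready:
        # Pick the first context with the highest priority.
        best = max(range(len(ready)), key=lambda i: ready[i].priority)
        current = ready.pop(best)
        # Never run longer than the quantum, so no context monopolizes the CPU.
        ran = min(current.quantum, current.remaining)
        current.remaining -= ran
        trace.append((current.name, ran))
        if current.remaining > 0:
            # Re-queue at the back so equal-priority peers take turns.
            ready.append(current)
    return trace
```

With strict priorities, a lower-priority context runs only after higher-priority work completes; this sketch terminates because each context has a finite amount of work.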

2. Master Controller

Referring still to FIG. 3, generally operating as a root task, the master controller component 372 is responsible for enforcing policy rules directed to operations of the virtualization software architecture 150. This responsibility is in contrast to the micro-hypervisor 360, which provides mechanisms for inter-process communications and resource allocation, but does not dictate how and when such functions occur.

Herein, the master controller component 372 may be configured with a policy engine 380 to conduct a number of policy decisions, including some or all of the following: (1) memory allocation (e.g., distinct physical address space assigned to different software components); (2) execution time allotment (e.g., scheduling and duration of execution time allotted on a selected process basis); (3) virtual machine creation (e.g., number of VMs, OS type, etc.); (4) inter-process communications (e.g., which processes are permitted to communicate with which processes, etc.); and/or (5) network device reallocation to the second virtual machine 175 with the recovery OS 310 in response to detecting that the current guest OS 300 has been compromised.

Additionally, the master controller component 372 is responsible for the allocation of resources. Initially, the master controller component 372 receives access to most of the physical resources, except for access to security critical resources that should be driven by high privileged (Ring-0) components, not user space (Ring-3) software components such as hyper-processes 370. For instance, while precluded from access to the memory management unit (MMU) or the interrupt controller, the master controller component 372 may be configured with OS evaluation logic 385, which is adapted to control the selection of which software components are responsible for driving which network devices. For instance, the master controller component 372 may reconfigure the IOMMU 250 of FIG. 2 so that (i) the vCPU 303 of the first virtual machine 170 is halted, (ii) some or all of the network devices 240 that were communicatively coupled to (and driven by) the guest OS kernel 301 are now under control of the recovery OS kernel 310, including a network adapter 314 that is part of the recovery OS 310, and (iii) the vCPU 313 of the second virtual machine 175 is activated from a previously dormant state, where the operability of the second virtual machine 175 is controlled by the recovery OS 310.

The master controller component 372 is platform agnostic. Thus, the master controller component 372 may be configured to enumerate what hardware is available to a particular process (or software component) and to configure the state of the hardware (e.g., activate, place into sleep state, etc.).

By separating the master controller component 372 from the micro-hypervisor 360, a number of benefits are achieved. One inherent benefit is increased security. When the functionality is placed into a single binary, which is running in host mode, any vulnerability may place the entire computing device at risk. In contrast, each of the software components within the host mode is running in its own separate address space.

3. Guest Monitor

Referring still to FIG. 3, the guest monitor component 374 is an instance of a user space application that is responsible for managing the execution of the first virtual machine 170 and/or the second virtual machine 175. Such management includes operating in concert with the threat protection component 376 to determine whether or not certain events, detected by the guest monitor component 374 during processing of the object 335 within the VM 170, are malicious.

In response to receiving one or more events from the guest agent 172 that are directed to the network adapter 304, the guest monitor component 374 determines whether any events are directed to disabling or disrupting operations of the network adapter 304. Data associated with the events is forwarded to the threat protection component 376. Based on this data, the threat protection component 376 may determine if the events denote that the guest OS 300 is compromised and the events further suggest that malware is within the guest OS 300 and is attempting to disrupt or has disabled external network communications for the endpoint device 140 ₃.

Although the IOMMU 250 of FIG. 2 may be responsible for reassigning control of network devices among different operating systems, notably control of the network adapter, according to one embodiment of the disclosure, it is contemplated that the policy engine 380 of the guest monitor component 374 may be configured to handle reassignment of network device controls in addition to containing lateral movement of the malware, such as by halting operability of the compromised guest OS 300 and initiating activation of the second virtual machine 175 with the recovery OS 310. Of course, other components within the virtualization layer 185 may be configured to handle (or assist in) the shift of operability from the first virtual machine 170 with the guest OS 300 to the second virtual machine 175 with the recovery OS 310.

4. Threat Protection Component

As described above and shown in FIG. 3, detection of a suspicious and/or malicious object 335 may be performed by static and dynamic analysis of the object 335 within the first virtual machine 170. Events associated with the process are monitored and stored by the guest agent process 172. Operating in concert with the guest agent process 172, the threat protection component 376 is responsible for further malware detection on the endpoint device 140 ₃ based on an analysis of events received from the guest agent process 172 running in the first virtual machine 170. It is contemplated, however, that detection of suspicious/malicious activity may also be conducted completely outside the guest environment 160, such as solely within the threat protection logic 376 of the host environment 180. The threat protection logic 376 relies on an interaction with the guest agent process 172 when it needs to receive semantic information from inside the guest OS that the host environment 180 could not otherwise obtain itself.

After analysis, the detected events are correlated and classified as benign (i.e., determination of the analyzed object 335 being malicious is less than a first level of probability); suspicious (i.e., determination of the analyzed object 335 being malicious is between the first level and a second level of probability); or malicious (i.e., determination of the analyzed object 335 being malicious is greater than the second level of probability). The correlation and classification operations may be accomplished by a behavioral analysis logic 390 and a classifier 395. The behavioral analysis logic 390 and classifier 395 may cooperate to analyze and classify certain observed behaviors of the object (based on events) as indicative of malware.
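The three-way classification described above may be expressed, by way of example only, as a comparison of a maliciousness probability against two levels. The function name `classify`, the default levels of 0.3 and 0.7, and the treatment of a probability exactly equal to either level as “suspicious” are assumptions of this sketch.

```python
def classify(probability, first_level=0.3, second_level=0.7):
    """Map a maliciousness probability to benign, suspicious, or malicious.

    first_level and second_level are illustrative values, not values
    taken from the disclosure.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability < first_level:
        return "benign"
    if probability > second_level:
        return "malicious"
    # Between the two levels, boundaries included by assumption.
    return "suspicious"
```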

In particular, the observed run-time behaviors by the guest agent 172 are provided to the behavioral analysis logic 390 as dynamic analysis results. These events may include commands that may be construed as disrupting or disabling operability of the network adapter, which may be hooked (intercepted) for handling by the virtualization layer 185. As a result, the guest monitor component 374 receives data associated with the events from the guest agent 172 and routes the same to the threat protection component 376.

At this time, the static analysis results and dynamic analysis results may be stored in memory 220, along with any additional data from the guest agent 172. These results may be provided via coordinated IPC-based communications to the behavioral analysis logic 390, which may provide correlation information to the classifier 395. Additionally, or in the alternative, the results and/or events may be provided or attempted to be reported via a network device initiated by the guest OS kernel to the MDS 140 ₂ for correlation. The behavioral analysis logic 390 may be embodied as a rules-based correlation engine illustratively executing as an isolated process (software component) that communicates with the guest environment 160 via the guest monitor component 374.

In an embodiment, the behavioral analysis logic 390 may be configured to operate on correlation rules that define, among other things, patterns (e.g., sequences) of known malicious events (if-then statements with respect to, e.g., attempts by a process to change memory in a certain way that is known to be malicious) and/or known non-malicious events. The events may collectively correlate to malicious behavior. The rules of the behavioral analysis logic 390 may then be correlated against those dynamic analysis results, as well as static analysis results, to generate correlation information pertaining to, e.g., a level of risk or a numerical score used to arrive at a decision of maliciousness.
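As a minimal sketch of rule-based correlation of the kind described above, each rule below pairs an ordered pattern of events with a weight, and the weights of all matching rules are summed into a numerical score. Representing rules as (pattern, weight) pairs, using negative weights for known non-malicious patterns, and matching patterns as ordered but not necessarily contiguous subsequences are all assumptions made for illustration; the event names are likewise hypothetical.

```python
def matches(pattern, events):
    """Return True if pattern occurs in events in order (gaps allowed)."""
    remaining = iter(events)
    # Each membership test consumes the iterator up to the matched event,
    # so later steps can only match later events.
    return all(step in remaining for step in pattern)


def correlate(events, rules):
    """Sum the weights of every rule whose pattern matches the event stream.

    rules is a list of (pattern, weight) pairs; a positive weight marks a
    known malicious pattern and a negative weight a known benign one.
    """
    return sum(weight for pattern, weight in rules if matches(pattern, events))
```

The resulting score could then be compared against a threshold by a classifier; how weights and thresholds are chosen is outside the scope of this sketch.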

The classifier 395 may be configured to use the correlation information provided by behavioral analysis logic 390 to render a decision as to whether the object 335 is malicious. Illustratively, the classifier 395 may be configured to classify the correlation information, including monitored behaviors (expected and unexpected/anomalous) and access violations, of the object 335 relative to those of known malware and benign content.

Periodically or aperiodically, rules may be pushed from the MDS 140 ₂ to the endpoint device 140 ₃ to update the behavioral analysis logic 390, wherein the rules may be applied as different behaviors and monitored. For example, the correlation rules pushed to the behavioral analysis logic 390 may include rules that specify a level of probability of maliciousness, requests to close certain network ports that are ordinarily used by an application program, and/or attempts to disable certain functions performed by the network adapter. Alternatively, the correlation rules may be pulled based on a request from the endpoint device 140 ₃ to determine whether new rules are available, and in response, the new rules are downloaded.

Illustratively, the behavioral analysis logic 390 and classifier 395 may be implemented as separate modules although, in the alternative, the behavioral analysis logic 390 and classifier 395 may be implemented as a single module disposed over (i.e., running on top of) the micro-hypervisor 360. The behavioral analysis logic 390 may be configured to correlate observed behaviors (e.g., results of static and dynamic analysis) with known malware and/or benign objects (embodied as defined rules) and generate an output (e.g., a level of risk or a numerical score associated with an object) that is provided to and used by the classifier 395 to render a decision of malware based on the risk level or score exceeding a probability threshold. The reporting module 336, which executes as a user mode process in the guest OS 300, is configured to generate an alert for transmission external to the endpoint device 140 ₃ (e.g., to one or more other endpoint devices, a management appliance, or MDS 140 ₂) in accordance with “post-solution” activity.

IV. Compromised Guest OS Kernel Detection and OS Recovery

According to one embodiment of the disclosure, the virtualization layer provides enhanced detection of a compromised software component (e.g., guest OS 300) operating within a virtual machine. The guest OS 300 is considered “compromised” when, due to a malicious attack, the functionality of the guest OS kernel 301 has been altered to disrupt or completely disable external network connectivity for the endpoint device. Also, the guest OS 300 may be considered “compromised” when an attacker has managed to take control of the guest OS kernel 301 and alter its functionality (e.g., by disabling network connectivity, etc.).

After detection, the virtualization layer 185 is configured to halt operability of the compromised (active) guest OS 300 and reconfigure the IOMMU 250 to assign some or all of the network devices, formerly driven by the guest OS 300 of the first virtual machine 170, to now be driven by the recovery OS 310 of the second virtual machine 175. Thereafter, the second virtual machine 175 undergoes a boot process, which initializes this virtual platform and places all of the network devices into a trustworthy state. Now, the external network connectivity for the endpoint device, as driven by the recovery OS 310 of the second virtual machine 175, is in operation. The first virtual machine 170 may undergo a graceful handoff (takeover) to allow the first virtual machine 170 to complete its analysis and to save state upon such completion which may be used in forensic analysis to determine when and how the guest OS 300 was compromised.

A variety of techniques may be used to detect a change in functionality of the guest OS 300 that constitutes an attempted disruption or a disabling of external network connectivity from the endpoint device. In response to such detection, the virtualization layer alters an operating state of the second virtual machine 175 with the recovery OS 310.

As shown in FIG. 4A, a first technique involves the OS evaluation logic of the master controller component 372 transmitting a message destined to the network adapter 304 via the guest agent (not shown) to acquire state information from the guest OS 300 (see operations 400-403). The state information may include, but is not limited or restricted to, the current operating state of the network adapter, such as the presence or absence of keepalive network packets, the presence or absence of network interrupts, or information from statistical registers in the network adapter, as described above. Upon receipt of the state information, the master controller component 372 determines, in accordance with the policy rules governing operability of the network adapter 304, whether the guest OS 300 has been compromised (operation 405).
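As a minimal sketch of how such a policy rule might evaluate the acquired state information, consider the following. The field names and the specific rule (no keepalives, no interrupts, and stalled packet counters between two samples) are assumptions made for illustration; the disclosure does not define a particular policy.

```python
from dataclasses import dataclass


@dataclass
class AdapterState:
    """One sample of network adapter state (field names are hypothetical)."""
    keepalives_seen: bool   # keepalive network packets observed since last sample
    interrupts_seen: bool   # network interrupts observed since last sample
    tx_packets: int         # statistical register: packets transmitted
    rx_packets: int         # statistical register: packets received


def likely_compromised(prev: AdapterState, curr: AdapterState) -> bool:
    """Hypothetical policy rule: flag the guest OS when the adapter shows no
    keepalives, no interrupts, and no counter movement between two samples."""
    counters_stalled = (
        curr.tx_packets == prev.tx_packets
        and curr.rx_packets == prev.rx_packets
    )
    signals_missing = not curr.keepalives_seen and not curr.interrupts_seen
    return signals_missing and counters_stalled
```

Requiring all three indicators to agree is one conservative choice that reduces false positives from a briefly idle adapter; a deployed policy could weigh the indicators differently.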

When the state information indicates that there is a high likelihood that the guest OS 300 has been compromised, the master controller component 372 may be configured to signal the guest monitor component 374 to halt operations of the first virtual machine 170 (operations 410-411). Additionally, the guest monitor component 374 may secure a copy of the actual state of the first virtual machine as a snapshot (operation 412). The master controller component 372, which is responsible for policy decisions as to device resources, may request the micro-hypervisor 360 to reconfigure the IOMMU 250 (operation 415).

Thereafter, the micro-hypervisor 360 reassigns the network devices and the device resources (e.g., device registers, memory registers, etc.) to the recovery OS 310 (operation 420). As described, the recovery OS 310 may be deployed in a different virtual machine than the guest OS 300 or may be merely substituted for the guest OS 300 and corresponding guest application(s). After such reassignment, the virtualization layer (e.g., guest monitor component 374 and/or master controller component 372 and/or micro-hypervisor 360) boots the virtual machine under control of the recovery OS 310 to subsequently establish network connectivity through one or more external communication channels with a computing device remotely located from the endpoint device 140 ₃ (operations 425-426).

Referring now to FIG. 4B, a second technique for detecting that the first guest OS is compromised with subsequent OS recovery is shown. Herein, the master controller component 372 prompts the guest monitor component 374 to obtain state information from the network adapter 304 that drives the physical network adapter (operations 430-432). Responsive to receipt of the state information, the guest monitor component 374 transmits at least a portion of the state information to the threat protection component 376, which analyzes the state information to determine whether the state information suggests that the first guest OS is compromised (operations 435-437).

Upon receipt of the results of the analysis by the threat protection component 376, if the results identify that there is a high likelihood that the guest OS 300 has been compromised, the master controller component 372 may be configured to signal the guest monitor component 374 to halt operations of the first virtual machine 170 and obtain a copy of the actual state of the first virtual machine 170 as a snapshot (operations 440-442). Additionally, the master controller component 372 may request the micro-hypervisor 360 to reconfigure the IOMMU 250 (operation 445).

After receiving a request from the master controller component 372, the micro-hypervisor 360 reconfigures the IOMMU 250, which reassigns the network devices and the device resources (e.g., device registers, memory registers, etc.) to the recovery OS 310 (operation 450). After such reassignment, the virtualization layer boots the virtual machine under control of the recovery OS 310, which subsequently establishes network connectivity through one or more external communication channels with a computing device remotely located from the endpoint device 140 ₃ (operations 455-457).
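The ordering shared by the techniques of FIGS. 4A and 4B (halt, snapshot, IOMMU reconfiguration, then boot) can be illustrated with the following sketch. The class and step names are hypothetical, no actual hypervisor or IOMMU interface is invoked, and the snapshot step is treated as mandatory here purely for simplicity even though the description presents it as optional.

```python
from enum import Enum, auto


class Step(Enum):
    """Recovery steps in the order described for FIGS. 4A and 4B."""
    HALT_FIRST_VM = auto()
    SNAPSHOT_FIRST_VM = auto()
    RECONFIGURE_IOMMU = auto()   # reassigns network devices to the recovery OS
    BOOT_RECOVERY_VM = auto()


# Enum iteration preserves definition order.
REQUIRED_ORDER = list(Step)


class RecoverySequencer:
    """Hypothetical helper that rejects recovery steps performed out of order."""

    def __init__(self):
        self.completed = []

    @property
    def is_complete(self):
        return len(self.completed) == len(REQUIRED_ORDER)

    def perform(self, step):
        if self.is_complete:
            raise RuntimeError("recovery sequence already complete")
        expected = REQUIRED_ORDER[len(self.completed)]
        if step is not expected:
            raise RuntimeError(f"expected {expected.name}, got {step.name}")
        self.completed.append(step)
```

Enforcing the order in one place reflects the point made above that the virtual machine is booted only after the network devices and device resources have been reassigned to the recovery OS.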

Referring to FIG. 5, an exemplary embodiment of operations for detecting loss of network connectivity caused by a compromised guest OS and conducting an OS recovery response to re-establish network connectivity is shown. Herein, a first determination is made by the virtualization layer whether external network connectivity for the computing device has been disabled (item 500). This determination may be accomplished, for example, by monitoring a state of the network adapter through periodic (heartbeat) messages or by accessing certain statistical registers associated with the network adapter.
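One simple way to realize the heartbeat-based determination is sketched below. The function name, the one-second interval, and the tolerance of three missed heartbeats are assumptions chosen for illustration only.

```python
def connectivity_disabled(last_heartbeat_s, now_s, interval_s=1.0, missed_allowed=3):
    """Return True when the adapter has been silent for longer than the
    tolerated number of heartbeat intervals (all parameters hypothetical)."""
    silence = now_s - last_heartbeat_s
    return silence > interval_s * missed_allowed
```

Tolerating a few missed heartbeats before declaring connectivity disabled avoids reacting to transient delays, at the cost of slower detection.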

If external network connectivity for the computing device has been disabled, a second determination may be conducted as to why the external network connectivity has been disabled (item 510). This determination may involve an analysis of one or more events, as captured by the guest agent process, that led up to the loss of external network connectivity in order to confirm that the external network connectivity was disabled due to operations conducted by the guest OS. Otherwise, if loss of the external network connectivity is due to a hardware failure or activities that are unrelated to the guest OS, the analysis discontinues.

Upon determining that external network connectivity has been disabled due to operations conducted by the guest OS (perhaps after attempts to re-enable the external network connectivity), the virtualization layer concludes that the guest OS is compromised. Hence, state information (data associated with the operating state of the guest OS) may be captured and the operations of the first virtual machine (with the guest OS) are halted (items 520 and 530).

Thereafter, a dormant recovery OS that is resident in a non-transitory storage medium as an OS image may be fetched and installed into a selected virtual machine (item 540). The selected virtual machine may be the first virtual machine (where the recovery OS is substituted for the guest OS) or may be a second virtual machine different from the first virtual machine. Next, the network device resources (and network devices that are currently driven by the guest OS kernel of the first virtual machine) are re-assigned to the recovery OS (item 550). The selected virtual machine is then booted, which causes the recovery OS to run and configure its network adapter to establish external network connectivity so that the endpoint device may electronically communicate with other computing devices located remotely from the endpoint device (items 560 and 570). This allows for the transmission of reports and/or alert messages over a network, which may identify one or more malicious events that are detected during virtual processing of an object under test.
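The overall decision flow of FIG. 5 can be summarized by the following sketch. The function name, parameter names, and action strings are hypothetical and merely label the items described above; no actual virtualization interface is called.

```python
def recovery_actions(connectivity_disabled, caused_by_guest_os,
                     substitute_in_first_vm=False):
    """Return the ordered recovery actions for the FIG. 5 flow (illustrative).

    An empty list means no recovery is attempted, either because connectivity
    is intact (item 500) or because the loss is unrelated to the guest OS
    (item 510).
    """
    if not connectivity_disabled:
        return []
    if not caused_by_guest_os:
        return []
    # The recovery OS may replace the guest OS in the first VM or run in a
    # second, different VM (item 540).
    target = "first VM" if substitute_in_first_vm else "second VM"
    return [
        "capture guest OS state",                                  # items 520 and 530
        "halt first VM",                                           # items 520 and 530
        f"install recovery OS image into {target}",                # item 540
        "reassign network devices and resources to recovery OS",   # item 550
        f"boot {target}",                                          # items 560 and 570
        "establish external network connectivity",                 # items 560 and 570
    ]
```

For example, `recovery_actions(True, True)` yields the six actions targeting the second virtual machine, whereas a hardware failure (`caused_by_guest_os=False`) yields no actions because the analysis discontinues.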

In the foregoing description, the invention is described with reference to specific exemplary embodiments thereof. For instance, the guest OS and the recovery OS may be deployed on the same virtual machine, where the recovery OS remains dormant as a standby OS unless the guest OS is compromised. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention as set forth in the appended claims. 

What is claimed is:
 1. A computing device comprising: one or more hardware processors; and a memory coupled to the one or more hardware processors, the memory comprises one or more software components that, when executed by the one or more hardware processors, operate as (i) a virtualization layer deployed in a host environment of a virtualization software architecture and (ii) a plurality of virtual machines deployed within a guest environment of the virtualization software architecture, the plurality of virtual machines comprises (a) a first virtual machine that is operating under control of a first operating system and including an agent collecting runtime state information of a network adapter and (b) a second virtual machine that is separate from the first virtual machine and is operating under control of a second operating system in response to determining that the first operating system has been compromised, the second virtual machine being configured to drive the network adapter, wherein after receipt of the state information by the virtualization layer, transmitting at least a portion of the state information to a threat protection component being deployed within the virtualization layer, analyzing, by the threat protection component, the state information to determine whether the first operating system is compromised by at least determining whether (i) an external network connection through the network adapter has been disabled or (ii) a kernel of the first operating system is attempting to disable the external network connection through the network adapter, and upon receipt of the results of the analyzing by the threat protection component that the first operating system is compromised, signaling, by the virtualization layer, to halt operations of the first virtual machine, installing, by the virtualization layer, a second operating system image retained within the memory of the computing device into the second virtual machine, reassigning, by the virtualization layer, the network adapter and adapter resources to the second operating system, the second virtual machine configured to drive the network adapter, and booting the second virtual machine subsequent to the reassignment of the network adapter and the adapter resources from the first operating system to the second operating system.
 2. The computing device of claim 1, wherein the network adapter is configured to establish an external network connection to another computing device.
 3. The computing device of claim 1, wherein the memory comprises software, including the one or more software components that, when executed by the one or more hardware processors, operates as the virtualization software architecture that comprises the guest environment including the first virtual machine and the host environment including the virtualization layer that analyzes data provided from the first virtual machine to determine whether the first operating system has been compromised.
 4. The computing device of claim 3, wherein the virtualization layer in the host environment comprises (1) a guest monitor component that determines whether an event, received from a process running on the first virtual machine that is configured to monitor operability of the network adapter, is directed to disabling or disrupting functionality of the network adapter and (2) a threat protection component that determines that the first operating system is compromised if the event is classified as malicious.
 5. The computing device of claim 4, wherein an event of the one or more events is classified as malicious upon determining that the event represents that an external network connection via the network adapter has been disabled.
 6. The computing device of claim 4, wherein the event is classified as malicious upon determining that a kernel of the first operating system is attempting to disable the external network connection via the network adapter.
 7. The computing device of claim 3, wherein the virtualization layer in the host environment comprises a threat protection component that determines that the first operating system is compromised when the one or more events is classified as malicious upon determining that the first operating system is non-functional.
 8. The computing device of claim 3, wherein the virtualization layer in the host environment comprises a threat protection component that determines that the first operating system (OS) is compromised when the one or more events is classified as malicious upon determining that a guest OS application of the first operating system is inoperable.
 9. The computing device of claim 1, wherein the second virtual machine is configured by removal of a first operating system (OS) kernel and one or more guest OS applications of the first operating system and installation of a second OS kernel and one or more guest OS applications of the second operating system.
 10. The computing device of claim 1, wherein the first virtual machine transitions from an active state to an inactive state when the first operating system is determined to be compromised.
 11. The computing device of claim 1, wherein the first operating system is a different type of operating system than the second operating system.
 12. The computing device of claim 1, wherein the network adapter corresponds to a software-emulated data transfer device.
 13. A non-transitory storage medium that includes software that is executable by one or more processors and, upon execution, operates a virtualization software architecture, the non-transitory storage medium comprising: one or more software components that, when executed by the one or more processors, operate as a network adapter; one or more software components that, when executed by the one or more processors, operate as a virtualization layer; one or more software components that, when executed by the one or more processors, operate as a first virtual machine being part of the virtualization software architecture, the first virtual machine operating under control of a first operating system and including an agent collecting runtime state information of a network adapter; and one or more software components that, when executed by the one or more processors, operate as a second virtual machine being part of the virtualization software architecture, the second virtual machine operating under control of a second operating system in response to determining that the first operating system has been compromised in which functionality of the first operating system is determined to have been altered or network connectivity by the first virtual machine has been disabled, wherein after receipt of the state information by the virtualization layer, transmitting at least a portion of the state information to a threat protection component being deployed within the virtualization layer, analyzing, by the threat protection component, the state information to determine whether the first operating system is compromised by at least determining whether (i) an external network connection through the network adapter has been disabled or (ii) a kernel of the first operating system is attempting to disable the external network connection through the network adapter, and upon receipt of the results of the analyzing by the threat protection component that the first operating system is compromised, signaling, by the virtualization layer, to halt operations of the first virtual machine, installing, by the virtualization layer, a second operating system image retained within the memory of the computing device into the second virtual machine, reassigning, by the virtualization layer, the network adapter and adapter resources to the second operating system, the second virtual machine configured to drive the network adapter, and booting the second virtual machine subsequent to the reassignment of the network adapter and the adapter resources from the first operating system to the second operating system.
 14. The non-transitory storage medium of claim 13, wherein the virtualization layer analyzes data provided from the first virtual machine to determine whether the first operating system has been compromised.
 15. The non-transitory storage medium of claim 14, wherein the virtualization layer determines that the first operating system has been compromised based on a state of functionality of the network adapter in communications with the first operating system of the first virtual machine.
 16. The non-transitory storage medium of claim 14, wherein the virtualization layer comprises (1) a guest monitor component that determines whether one or more events, which are received from a process running on the first virtual machine that is configured to monitor operability of a network adapter in communications with the first operating system of the first virtual machine, is malicious as being directed to disabling or disrupting functionality of the network adapter and (2) the threat protection component that determines that the first operating system is compromised if the one or more events are classified as malicious.
 17. The non-transitory storage medium of claim 16, wherein the threat protection component classifies the one or more events as malicious upon determining that the one or more events represent that external network connection via the network adapter has been disabled.
 18. The non-transitory storage medium of claim 16, wherein the threat protection component classifies the one or more events as malicious upon determining that either (i) the first operating system is non-functional or (ii) an operability of a guest OS application of the first operating system has ceased.
 19. The non-transitory storage medium of claim 16, wherein the threat protection component classifies the one or more events as malicious upon determining that a kernel of the first operating system is attempting to disable the external network connection via the network adapter.
 20. The non-transitory storage medium of claim 13, wherein the second virtual machine is configured by removal of a first operating system (OS) kernel and one or more guest OS applications of the first operating system and installation of a second OS kernel and one or more guest OS applications of the second operating system.
 21. The non-transitory storage medium of claim 13, wherein the first virtual machine is independent from the second virtual machine.
 22. The non-transitory storage medium of claim 13, wherein the second virtual machine is a reconfiguration of the first virtual machine.
 23. A computerized method for protecting connectivity of a computing device to an external network in response to a virtualization layer of the computing device detecting that a guest operating system of the computing device has been compromised by a potential malicious attack through malware, the method comprising: operating a first virtual machine under control of a first operating system, the first virtual machine in communication with a network adapter and an agent collecting runtime state information of a network adapter; responsive to receipt of the state information by the virtualization layer, transmitting at least a portion of the state information to a threat protection component being deployed within the virtualization layer operating within a host environment of the computing device; analyzing, by the threat protection component, the state information to determine whether the first operating system is compromised by at least determining whether (i) an external network connection through the network adapter has been disabled or (ii) a kernel of the first operating system is attempting to disable the external network connection through the network adapter; and upon receipt of the results of the analyzing by the threat protection component that the first operating system is compromised, signaling, by the virtualization layer, to halt operations of the first virtual machine, installing, by the virtualization layer, a second operating system image retained within memory of the computing device into a second virtual machine, the second virtual machine being separate from the first virtual machine and allocated to a different address space than allocated to the first virtual machine, reassigning, by the virtualization layer, the network adapter and adapter resources to the second operating system, the second virtual machine configured to drive the reassigned network adapter, and booting the second virtual machine subsequent to the reassignment of the network adapter and the adapter resources from the first operating system to the second operating system.
 24. The computerized method of claim 23 further comprising: conducting a boot process on the second virtual machine so that external network connectivity, as driven by the second operating system of the second virtual machine, is in operation.
 25. The computerized method of claim 23, wherein the virtualization layer determines that the first operating system has been compromised based on a state of functionality of the network adapter in communications with the first operating system of the first virtual machine.
 26. The computerized method of claim 23, wherein the virtualization layer comprises (1) a guest monitor component that determines whether one or more events corresponding to the state information being received from the agent configured to monitor operability of the network adapter, are being directed to disabling or disrupting functionality of the network adapter and (2) the threat protection component that determines that the first operating system is compromised if the one or more events are classified as malicious.
 27. The computerized method of claim 26, wherein the threat protection component classifies the one or more events as malicious upon determining that the one or more events represent that external network connection via the network adapter has been disabled.
 28. The computerized method of claim 27, wherein the threat protection component classifies the one or more events as malicious upon determining that either (i) the first operating system is non-functional or (ii) an operability of a guest operating system (OS) application of the first operating system has ceased.
 29. The computerized method of claim 26, wherein the threat protection component classifies the one or more events as malicious upon determining that the kernel of the first operating system is attempting to disable the external network connection via the network adapter.
 30. The computerized method of claim 23, wherein the second virtual machine is configured by removal of at least the kernel of the first operating system and installation of a kernel of the second operating system. 